Add a simple cache to the ros3 VFD #3753

Merged 2 commits on Oct 23, 2023


derobins
Member

Adds a small cache of the first N bytes of a file opened with the read-only S3 (ros3) VFD, where N is 4 KiB or the size of the file, whichever is smaller. This avoids many small I/O operations on file open.

Addresses GitHub issue #3381
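
As a rough illustration of the sizing rule in the description (N is the smaller of 4 KiB and the file size), here is a minimal sketch. It is not the code from this PR: every `sketch_` name is hypothetical, and an in-memory buffer stands in for the S3 range request so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* 4 KiB cap, as stated in the PR description */
#define SKETCH_CACHE_MAX ((size_t)4 * 1024)

/* N = min(4 KiB, file size) */
static size_t sketch_cache_size(size_t file_size)
{
    return file_size < SKETCH_CACHE_MAX ? file_size : SKETCH_CACHE_MAX;
}

/* Hypothetical in-memory stand-in for a remote object */
typedef struct {
    const unsigned char *data;  /* pretend object contents */
    size_t               size;  /* object size in bytes */
    unsigned             calls; /* number of simulated range requests */
} sketch_mem_t;

/* Hypothetical range-read callback: returns 0 on success, -1 on failure */
typedef int (*sketch_range_read_t)(void *ctx, size_t offset, size_t len, unsigned char *dst);

/* Simulated range request against the in-memory object */
static int sketch_mem_read(void *ctx, size_t offset, size_t len, unsigned char *dst)
{
    sketch_mem_t *src = (sketch_mem_t *)ctx;

    if (offset > src->size || len > src->size - offset)
        return -1;
    memcpy(dst, src->data + offset, len);
    src->calls++;
    return 0;
}

/* Fill the cache with a single read at "open" time.
 * Returns a malloc'd buffer (caller frees) or NULL if nothing was cached. */
static unsigned char *sketch_fill_cache(sketch_range_read_t read_fn, void *ctx,
                                        size_t file_size, size_t *cache_size_out)
{
    size_t         n;
    unsigned char *cache;

    assert(read_fn != NULL && cache_size_out != NULL);

    *cache_size_out = 0;
    n = sketch_cache_size(file_size);
    if (n == 0)
        return NULL; /* empty file: nothing to cache */

    cache = (unsigned char *)malloc(n);
    if (cache == NULL)
        return NULL;

    if (read_fn(ctx, 0, n, cache) != 0) {
        free(cache);
        return NULL;
    }

    *cache_size_out = n;
    return cache;
}
```

The idea is that one larger read at open time replaces the many small reads that would otherwise hit the start of the file.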

@derobins added the labels "Merge - To 1.14", "Priority - 1. High 🔼", "Component - C Library", and "Type - Improvement" on Oct 23, 2023
    for (bin_i = 0; bin_i < ROS3_STATS_BIN_COUNT; bin_i++)
        if ((unsigned long long)size < ros3_stats_boundaries[bin_i])
            break;
    bin = (type == H5FD_MEM_DRAW) ? &file->raw[bin_i] : &file->meta[bin_i];
Member Author

@derobins derobins Oct 23, 2023

This stuff just got moved over because it's in an else clause now. I figured we only wanted to track stats when we did an actual read.
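
A minimal sketch of the control flow this comment describes, under stated assumptions: a read that falls entirely inside the cached prefix is served with a copy, and only the `else` branch performs a real read and updates the size-bin statistics. The `sketch_` names, the bin boundaries, and the in-memory "remote" buffer are all hypothetical, and treating partially cached reads as misses is an assumption rather than something the PR text confirms.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BIN_COUNT 4

/* Hypothetical upper bounds for the read-size bins (not the ros3 values) */
static const unsigned long long sketch_boundaries[SKETCH_BIN_COUNT] = {
    1024ULL, 4096ULL, 65536ULL, 1048576ULL};

/* Hypothetical stand-in for the per-file state */
typedef struct {
    const unsigned char *cache;      /* first cache_size bytes of the file */
    size_t               cache_size;
    const unsigned char *remote;     /* stand-in for the S3 object */
    size_t               eof;        /* object size in bytes */
    /* One extra slot catches reads larger than the last boundary */
    unsigned long long   bin_counts[SKETCH_BIN_COUNT + 1];
} sketch_file_t;

/* Returns 0 on success, -1 if the request runs past the end of the file */
static int sketch_read(sketch_file_t *file, size_t addr, size_t size, void *buf)
{
    assert(file != NULL && buf != NULL);

    if (addr > file->eof || size > file->eof - addr)
        return -1;

    if (file->cache != NULL && size <= file->cache_size &&
        addr <= file->cache_size - size) {
        /* Cache hit: no remote I/O happens, so no stats are recorded */
        memcpy(buf, file->cache + addr, size);
    }
    else {
        unsigned bin_i;

        /* Stand-in for the actual range request */
        memcpy(buf, file->remote + addr, size);

        /* Stats are tracked only for actual reads */
        for (bin_i = 0; bin_i < SKETCH_BIN_COUNT; bin_i++)
            if ((unsigned long long)size < sketch_boundaries[bin_i])
                break;
        file->bin_counts[bin_i]++;
    }

    return 0;
}
```

With this shape, the statistics reflect only requests that actually went to the remote store, which matches the intent stated in the comment.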

@jhendersonHDF
Collaborator

It would be good to address #3381's point about allowing the cache size to be configured at runtime at some point in the future before closing that issue.

@derobins derobins merged commit d76d591 into HDFGroup:develop Oct 23, 2023
jhendersonHDF pushed a commit to jhendersonHDF/hdf5 that referenced this pull request Oct 24, 2023
lrknox pushed a commit that referenced this pull request Oct 25, 2023
* Add missing test files to distclean target (#3734)

Cleans up new files in Autotools `make distclean` in the test directory

* Add tools/libtest to Autotools builds (#3735)

This was only added to CMake many years ago and tests the tools
library.

* Clean up onion VFD files in tools `make clean` (#3739)

Cleans up h5dump and h5diff *.onion files in the Autotools when
running `make clean`.

* Clean Java test files on Autotools (#3740)

Removes generated HDF5 and text output files when running `make clean`.

* Clean the flushrefresh test dir on Autotools (#3741)

The flushrefresh_test directory was not being cleaned up w/
`make clean` under the Autotools

* Fix file names in tfile.c (#3743)

Some tests in tfile.c use h5_fileaccess to get a VFD-dependent file
name but use the scheme from testhdf5, reusing the FILE1 and FILE8
names. This leads to files like test1.h5.h5 which are unintended
and not cleaned up.

This changes the filename scheme for a few tests to work with h5test,
resulting in more informative names and allowing the files to
be cleaned up at the end of the test. The test files have also
been added to the `make clean` target for the Autotools.

* Clean Autotools files in parallel tests (#3744)

Adds missing files to `make clean` for parallel, including Fortran.

* Add native VOL checks to deprecated functions (#3647)

* Add native VOL checks to deprecated functions

* Remove unneeded native VOL checks

* Move native checks to top level calls

* Fix buffer overflow in cache debugging code (#3691)

* update stat arg for apple (#3726)

* update stat arg for apple

* use H5_HAVE_DARWIN for Apple ifdef

* fix typo

* removed duplicate H5_ih_info_t

* added fortran async test to cmake

* Fix windows cpack error in WiX package. (#3747)

* Add a simple cache to the ros3 VFD (#3753)

Adds a small cache of the first N bytes of a file opened with the
read-only S3 (ros3) VFD, where N is 4 KiB or the size of the file,
whichever is smaller. This avoids a lot of small I/O operations
on file open.

Addresses GitHub issue #3381

* Update Autotools to correctly configure oneAPI (#3751)

* Update Autotools to correctly configure oneAPI

Splits the Intel config files under the Autotools into 'classic'
Intel and oneAPI versions, fixing 'unsupported option' messages.

Also turns off `-check uninit` (new in 2023) in Fortran, which kills
the H5_buildiface program due to false positives.

* Enable Fortran in oneAPI CI workflow

* Turn on Fortran in CMake, update LD_LIBRARY_PATH

* Go back to disabling Fortran w/ Intel

For some reason there's a linking problem w/ Fortran

error while loading shared libraries: libifport.so.5: cannot open shared object file: No such file or directory

* Add h5pget_actual_selection_io_mode fortran wrapper (#3746)

* added h5pget_actual_selection_io_mode_f test

* added tests for h5pget_actual_selection_io_mode_f

* fixed int_f type conversion

* Update fortran action step (#3748)

* Added missing DLL for H5PGET_ACTUAL_SELECTION_IO_MODE_F (#3760)

* add missing H5PGET_ACTUAL_SELECTION_IO_MODE_F dll

* Bump the ros3 VFD cache to 16 MiB (#3759)

* Fix hangs during collective I/O with independent metadata writes (#3693)

* Fix some issues with collective metadata reads for chunked datasets (#3716)

Add functions/callbacks for explicit control over chunk index open/close

Add functions/callbacks to check if chunk index is open or not so
that it can be opened if necessary before temporarily disabling
collective metadata reads in the library

Add functions/callbacks for requesting loading of additional chunk
index metadata beyond the chunk index itself

* Fix failure in t_select_io_dset when run with more than 10 ranks (#3758)

* Fix H5Pset_evict_on_close failing regardless of actual parallel use (#3761)

Allow H5Pset_evict_on_close to be called regardless of whether a parallel build of HDF5 is being used

Fail during file opens if H5Pset_evict_on_close has been set to true on the given File Access Property List and the size of the MPI communicator being used is greater than 1
@derobins derobins deleted the simple_ros3_vfd_cache branch March 27, 2024 18:02